Correlation Pursuit: Forward Stepwise Variable Selection for Index Models.

Authors

  • Wenxuan Zhong
  • Tingting Zhang
  • Yu Zhu
  • Jun S Liu
Abstract

In this article, a stepwise procedure, correlation pursuit (COP), is developed for variable selection under the sufficient dimension reduction framework, in which the response variable Y is influenced by the predictors X(1), X(2), …, X(p) through an unknown function of a few linear combinations of them. Unlike linear stepwise regression, COP does not impose a special form of relationship (such as linear) between the response variable and the predictor variables. The COP procedure selects variables that attain the maximum correlation between the transformed response and the linear combination of the variables. Various asymptotic properties of the COP procedure are established; in particular, its variable selection performance is investigated as both the number of predictors and the sample size diverge. The excellent empirical performance of the COP procedure in comparison with existing methods is demonstrated by both extensive simulation studies and a real example in functional genomics.
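The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the "transformed response" is obtained by slicing Y as in sliced inverse regression, so that the largest eigenvalue of Sigma^{-1} Cov(E[X | slice]) plays the role of the maximum squared correlation, and it stops after a fixed number of steps because the abstract does not state a stopping rule. The function names `sir_leading_eigenvalue` and `forward_select`, and the parameters `n_slices` and `n_steps`, are hypothetical.

```python
import numpy as np


def sir_leading_eigenvalue(X, y, n_slices=10):
    """Largest eigenvalue of Sigma^{-1} M, where M is the between-slice
    covariance of the predictors after slicing the response (assumed
    stand-in for the maximum squared correlation)."""
    n, k = X.shape
    Xc = X - X.mean(axis=0)
    sigma = (Xc.T @ Xc) / n
    # Slice the observations into groups of roughly equal size by sorted y.
    slices = np.array_split(np.argsort(y), n_slices)
    m = np.zeros((k, k))
    for idx in slices:
        mean_h = Xc[idx].mean(axis=0)
        m += (len(idx) / n) * np.outer(mean_h, mean_h)
    # Eigenvalues are real in exact arithmetic; drop rounding-level imaginary parts.
    eigvals = np.linalg.eigvals(np.linalg.solve(sigma, m))
    return float(np.max(eigvals.real))


def forward_select(X, y, n_steps, n_slices=10):
    """Greedy forward steps: add the predictor that most increases the
    leading eigenvalue. The fixed n_steps is an assumption of this sketch."""
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(n_steps):
        scores = {
            j: sir_leading_eigenvalue(X[:, selected + [j]], y, n_slices)
            for j in remaining
        }
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Synthetic single-index data: y depends on x0 + x1 through a nonlinear link.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 10))
    y = np.sinh(X[:, 0] + X[:, 1]) + 0.1 * rng.standard_normal(500)
    print(sorted(forward_select(X, y, n_steps=2)))
```

On the synthetic data in the demo, only the first two predictors drive the response, so two forward steps should pick them out. A practical procedure would replace the fixed step count with a data-driven stopping rule and would typically allow variables to be removed again.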

Similar resources

Note for Alan’s Class

Wavelets have proven to be immensely useful for signal analysis and representation [7]. Various dictionaries of wavelets have been designed for different types of signals or function spaces [3, 13]. Two key factors underlying the successes of wavelets are the sparsity of the representation and the efficiency of the analysis. Specifically, a signal can typically be represented by a linear superp...

Prediction of the adsorption capability onto activated carbon of liquid aliphatic alcohols using molecular fragments method

Quantitative structure-property relationships (QSPR) for estimating the adsorption of aliphatic alcohols onto activated carbon were developed using the substructural molecular fragments (SMF) method. The adsorption capacity of activated carbon (gr/100grC) for 150 aliphatic alcohols onto activated carbon (AC) is studied under equilibrium conditions. Forward and backwards stepwise regression variable ...

Forward Selection and Estimation in High Dimensional Single Index Models

We propose a new variable selection and estimation technique for high dimensional single index models with unknown monotone smooth link function. Among many predictors, typically, only a small fraction of them have significant impact on prediction. In such a situation, more interpretable models with better prediction accuracy can be obtained by variable selection. In this article, we propose a ...

Greedy and Relaxed Approximations to Model Selection: A simulation study

The Minimum Description Length (MDL) principle is an important tool for retrieving knowledge from data as it embodies the scientific striving for simplicity in describing the relationship among variables. As MDL and other model selection criteria penalize models on their dimensionality, the estimation problem involves a combinatorial search over subsets of predictors and quickly becomes computati...

Automated variable selection methods for logistic regression produced unstable models for predicting acute myocardial infarction mortality.

OBJECTIVES: Automated variable selection methods are frequently used to determine the independent predictors of an outcome. The objective of this study was to determine the reproducibility of logistic regression models developed using automated variable selection methods. STUDY DESIGN AND SETTING: An initial set of 29 candidate variables was considered for predicting mortality after acute myoc...

Journal title:
  • Journal of the Royal Statistical Society. Series B, Statistical methodology

Volume 74, Issue 5

Pages: -

Publication date: 2012